Handling Missing Values via a Neural Selective Input Model

Authors

  • Noel Lopes
  • Bernardete Ribeiro
Abstract

Missing data represent a ubiquitous problem with numerous and diverse causes. Handling Missing Values (MVs) properly is a crucial issue, in particular in Machine Learning (ML) and pattern recognition. To date, the only option available for standard Neural Networks (NNs) to handle this problem has been to rely on pre-processing techniques such as imputation for estimating the missing data values, which has considerably limited the scope of their application. To circumvent this limitation we propose a Neural Selective Input Model (NSIM) that accommodates different transparent and bound models, while providing support for NNs to handle MVs directly. By embedding the mechanisms to support MVs we can obtain better models that reflect the uncertainty caused by unknown values. Experiments on several UCI datasets with different distributions and proportions of MVs show that the NSIM approach is very robust and yields good to excellent results. Furthermore, the NSIM performs better than the state-of-the-art imputation techniques either with higher prevalence of MVs in a large number of features or with a significant proportion of MVs, while delivering competitive performance in the remaining cases. We demonstrate the usefulness and validity of the NSIM, making this a first-class method for dealing with this problem.

Similar resources

Missing Values in a Backpropagation Neural Net

An empirical study of methods of handling missing values in a backpropagation neural network is presented. Neural networks can be applied to many real world systems to perform classification, pattern recognition or prediction on the basis of input data. However, many such applications cannot guarantee that the data provided to the network will be complete. The backpropagation network does not l...

Recurrent Neural Networks for Missing or Asynchronous Data

In this paper we propose recurrent neural networks with feedback into the input units for handling two types of data analysis problems. On the one hand, this scheme can be used for static data when some of the input variables are missing. On the other hand, it can also be used for sequential data, when some of the input variables are missing or are available at different frequencies. Unlike in t...

Investigating the missing data effect on credit scoring rule based models: The case of an Iranian bank

Credit risk management is a process in which banks estimate the probability of default (PD) for each loan applicant. Data sets of previous loan applicants are built by gathering their data; these internal data sets are usually completed using external credit bureau data and finally used for estimating PD in banks. There is also a continuous interest for banks to use rule based classifiers to b...

Recurrent Neural Networks for Missing or Asynchronous Data

In this paper we propose recurrent neural networks with feedback into the input units for handling two types of data analysis problems. On the one hand, this scheme can be used for static data when some of the input variables are missing. On the other hand, it can also be used for sequential data, when some of the input variables are missing or are available at different frequencies. Unlike in the cas...

Handling missing values in support vector machine classifiers

This paper discusses the task of learning a classifier from observed data containing missing values amongst the inputs which are missing completely at random. A non-parametric perspective is adopted by defining a modified risk taking into account the uncertainty of the predicted outputs when missing values are involved. It is shown that this approach generalizes the approach of mean imputation ...

Publication date: 2014